Dominance Properties for Divisible MapReduce Computations

Authors

  • J. Berlińska
  • M. Drozdowski
Abstract

In this paper we analyze MapReduce distributed computations as a divisible load scheduling problem. The two operations of mapping and reducing can be understood as two divisible applications with precedence constraints. A divisible load model is proposed, and schedule dominance properties are analyzed. We investigate dominant schedule structures for MapReduce computations. To the best of our knowledge, this is the first time that processing divisible loads with precedence constraints has been considered on the grounds of divisible load theory.

Similar resources

Security and Privacy Aspects in MapReduce on Clouds: A Survey

MapReduce is a programming system for distributed processing of large-scale data in an efficient and fault-tolerant manner on a private, public, or hybrid cloud. MapReduce is used extensively every day around the world as an efficient distributed computation tool for a large class of problems, e.g., search, clustering, log analysis, different types of join operations, matrix multiplication, pattern ma...

Google's MapReduce programming model - Revisited

Google’s MapReduce programming model serves for processing and generating large data sets in a massively parallel manner (subject to a suitable implementation of the model). We deliver the first rigorous description of the model. To this end, we reverse-engineer the seminal MapReduce paper and we capture our observations, assumptions and recommendations as an executable specification. We also i...

MapReduce with Deltas

The MapReduce programming model is extended conservatively to deal with deltas for input data such that recurrent MapReduce computations can be more efficient for the case of input data that changes only slightly over time. That is, the extended model enables more frequent re-execution of MapReduce computations and thereby more up-to-date results in practical applications. Deltas can also be pu...

A Computational Model for Mapreduce Job Flow

Massive quantities of data are today processed using parallel computing frameworks that parallelize computations on large distributed clusters consisting of many machines. Such frameworks are adopted in big data analytic tasks such as recommender systems, social network analysis, and legal investigation, which involve iterative computations over large datasets. One of the most used frameworks is MapReduce,...

RDFPath: Path Query Processing on Large RDF Graphs with MapReduce

The MapReduce programming model has gained traction in different application areas in recent years, ranging from the analysis of log files to the computation of the RDFS closure. Yet, for most users the MapReduce abstraction is too low-level since even simple computations have to be expressed as Map and Reduce phases. In this paper we propose RDFPath, an expressive RDF path query language geare...

Publication date: 2009